A robust feature extraction based on the MT reverberant envir
Abstract
This paper proposes a robust feature extraction method for automatic speech recognition (ASR) systems in reverberant environments. In this method, a sub-band power envelope inverse filtering algorithm based on the modulation transfer function (MTF), which we proposed previously, is incorporated as a front-end processor for ASR. The impulse response of the room acoustics is assumed to be white noise modulated by an exponential decay, and speech is assumed to be temporally modulated white noise in each sub-band. Because of these assumptions, the impulse response of the environment does not need to be measured. Testing demonstrated that this algorithm can restore the temporal power envelope of reverberant speech in each sub-band and thus reduce the loss of speech intelligibility caused by reverberation. Its recognition performance was tested on digitized Japanese speech, using reverberant speech created by simply convolving the speech with the room impulse response. Averaged over reverberation times from 0.1 to 2.0 s, the algorithm had a 32.1% higher error reduction rate than the traditional method based on cepstral mean normalization (CMN) of the auditory power spectrum (AFCC).
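The envelope model described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single power envelope rather than a sub-band filterbank, an exponentially decaying room power envelope that drops 60 dB over a reverberation time `t60`, and that `t60` and `gain` are already known (the abstract does not say how these parameters are obtained). The function names are hypothetical. Under those assumptions, the reverberant power envelope is the clean envelope convolved with a geometric sequence, so a first-order difference inverts it.

```python
import numpy as np


def decay_ratio(t60, fs):
    """Per-sample power decay ratio for a 60 dB drop over t60 seconds (assumed model)."""
    return 10.0 ** (-6.0 / (t60 * fs))


def reverberate_envelope(env_pow, t60, fs, gain=1.0):
    """Simulate a reverberant power envelope.

    Convolves the clean power envelope with the assumed room power
    envelope gain * r**n and truncates to the input length.
    """
    r = decay_ratio(t60, fs)
    n = np.arange(len(env_pow))
    room_pow = gain * r ** n
    return np.convolve(env_pow, room_pow)[: len(env_pow)]


def inverse_filter_envelope(rev_pow, t60, fs, gain=1.0):
    """Restore the clean power envelope with a first-order inverse filter.

    Since rev[n] = gain * sum_k clean[k] * r**(n - k), it follows that
    clean[n] = (rev[n] - r * rev[n - 1]) / gain.
    """
    r = decay_ratio(t60, fs)
    prev = np.concatenate(([0.0], rev_pow[:-1]))
    restored = (rev_pow - r * prev) / gain
    # A power envelope cannot be negative; clip small numerical undershoots.
    return np.maximum(restored, 0.0)


if __name__ == "__main__":
    fs = 100.0                      # envelope sampling rate in Hz (illustrative)
    t = np.arange(200) / fs
    clean = 1.0 + 0.5 * np.sin(2 * np.pi * 4.0 * t)   # 4 Hz modulation
    rev = reverberate_envelope(clean, t60=0.5, fs=fs)
    restored = inverse_filter_envelope(rev, t60=0.5, fs=fs)
    print("max restoration error:", np.max(np.abs(restored - clean)))
```

In a front end of the kind the abstract describes, a step like this would presumably run on each sub-band's power envelope before feature computation. With a mismatched `t60` the restoration is only approximate.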
Similar resources
Combining MLLR Adaptation and Feature Extraction for Robust Speech Recognition in Reverberant Environments
This paper presents an investigation of speech recognition performance in reverberant environments. Reverberant noise has been a major concern in speech recognition systems. Many speech recognition systems, even with state-of-the-art features, fail to cope with reverberant effects, and the recognition rate deteriorates. This shows the limitations of robust feature extraction in reverberant environm...
Robust ASR in Reverberant Environments Using Temporal Cepstrum Smoothing for Speech Enhancement and an Amplitude Modulation Filterbank for Feature Extraction
This paper presents techniques aiming at improving automatic speech recognition (ASR) in single channel scenarios in the context of the REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge. System improvements range from speech enhancement over robust feature extraction to model adaptation and word-based integration of multiple classifiers. The selective temporal cepstrum ...
Robust Feature Extraction for Speech Recognition by Enhancing Auditory Spectrum
The goal of this work is to improve the robustness of speech recognition systems in additive noise and real-time reverberant environments. In this paper we present a compressive gammachirp filter-bank-based feature extractor that incorporates a method for enhancing the auditory spectrum and a short-time feature normalization technique, which, by adjusting the scale and mean of cepstral feat...
Robust feature extraction based on an asymmetric level-dependent auditory filterbank and a subband spectrum enhancement technique
In this paper we introduce a robust feature extractor, dubbed robust compressive gammachirp filterbank cepstral coefficients (RCGCC), based on an asymmetric and level-dependent compressive gammachirp filterbank and a sigmoid-shaped weighting rule for the enhancement of speech spectra in the auditory domain. The goal of this work is to improve the robustness of speech recognition systems in ad...
Time-Varying Autoregressions for Speaker Verification in Reverberant Conditions
In poor room acoustic conditions, speech signals received by a microphone can be corrupted by delayed versions of the signal that are reflected from the room surfaces (e.g., walls, floor). This phenomenon, reverberation, reduces the accuracy of automatic speaker verification systems by causing a mismatch between training and testing. Since reverberation causes temporal smearing to the signa...